
    A Game-Theoretic Approach for Runtime Capacity Allocation in MapReduce

    Nowadays many companies have large amounts of raw, unstructured data available. Among Big Data enabling technologies, a central place is held by the MapReduce framework and, in particular, by its open source implementation, Apache Hadoop. For cost effectiveness, a common approach entails sharing server clusters among multiple users. The underlying infrastructure should provide every user with a fair share of computational resources, ensuring that Service Level Agreements (SLAs) are met and avoiding waste. In this paper we consider two mathematical programming problems that model the optimal allocation of computational resources in a Hadoop 2.x cluster, with the aim of developing new capacity allocation techniques that guarantee better performance in shared data centers. Our goal is to achieve a substantial reduction of power consumption while respecting the deadlines stated in the SLAs and avoiding the penalties associated with job rejections. The core of this approach is a distributed algorithm for runtime capacity allocation, based on Game Theory models and techniques, that mimics the MapReduce dynamics by means of interacting players, namely the central Resource Manager and the Class Managers.
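
    The abstract does not reproduce the underlying game-theoretic formulation, but the interaction it describes between Class Managers and the Resource Manager can be illustrated with a toy negotiation round: each Class Manager asks for enough capacity to meet its deadline, and the Resource Manager rescales requests when they exceed the cluster size. The class names, deadlines, and proportional scaling rule below are illustrative assumptions, not the paper's algorithm.

    ```python
    # Toy illustration (not the paper's algorithm): Class Managers request enough
    # capacity to meet their deadlines; the Resource Manager rescales requests
    # proportionally when the total exceeds the cluster capacity.

    CLUSTER_CONTAINERS = 120  # hypothetical total YARN containers

    # Hypothetical job classes: remaining work (container-seconds) and deadline (s)
    classes = {
        "interactive": {"work": 3600.0, "deadline": 30.0},
        "batch":       {"work": 7200.0, "deadline": 300.0},
        "reporting":   {"work": 1800.0, "deadline": 60.0},
    }

    def class_manager_demand(job):
        """Each Class Manager requests the minimum capacity meeting its deadline."""
        return job["work"] / job["deadline"]

    def resource_manager_allocate(demands, total):
        """The Resource Manager grants demands if feasible, otherwise scales them
        proportionally so the shares fit the cluster capacity."""
        requested = sum(demands.values())
        factor = min(1.0, total / requested)
        return {name: d * factor for name, d in demands.items()}

    demands = {name: class_manager_demand(job) for name, job in classes.items()}
    allocation = resource_manager_allocate(demands, CLUSTER_CONTAINERS)
    for name, share in allocation.items():
        print(f"{name}: {share:.1f} containers")
    ```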

    A Combined Analytical Modeling Machine Learning Approach for Performance Prediction of MapReduce Jobs in Hadoop Clusters

    Nowadays MapReduce and its open source implementation, Apache Hadoop, are the most widespread solutions for handling massive datasets on clusters of commodity hardware. At the expense of somewhat reduced performance in comparison to HPC technologies, the MapReduce framework provides fault tolerance and automatic parallelization without any effort by developers. Since in many cases Hadoop is adopted to support business critical activities, it is often important to predict with fair confidence the execution time of submitted jobs, for instance when SLAs are established with end users. In this work, we propose and validate a hybrid approach exploiting both queuing networks and support vector regression, in order to achieve good accuracy without too many costly experiments on a real setup. The experimental results show how the proposed approach attains a 21% improvement in accuracy over applying machine learning techniques without any support from analytical models.
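
    A minimal sketch of the hybrid idea: feed a cheap analytical estimate (here a simple task-waves approximation, standing in for the queuing-network prediction) to a support vector regressor alongside raw job parameters, so the learner mostly corrects what the analytical model misses. The synthetic data, feature choice, and hyper-parameters are assumptions for illustration only.

    ```python
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)

    # Synthetic training jobs: number of map tasks and number of containers.
    n_maps = rng.integers(50, 500, size=80)
    containers = rng.integers(4, 40, size=80)

    # Cheap analytical estimate standing in for the queuing-network model:
    # average task time multiplied by the number of execution waves.
    AVG_TASK_TIME = 12.0  # seconds, assumed
    analytical = AVG_TASK_TIME * np.ceil(n_maps / containers)

    # "Measured" times: the analytical trend plus overheads the model misses.
    measured = analytical * 1.15 + 30.0 + rng.normal(0.0, 10.0, size=80)

    # The SVR sees both the raw job features and the analytical prediction.
    X = np.column_stack([n_maps, containers, analytical])
    model = make_pipeline(StandardScaler(), SVR(C=100.0, epsilon=5.0))
    model.fit(X, measured)

    test = np.array([[300, 16, AVG_TASK_TIME * np.ceil(300 / 16)]])
    print("predicted execution time:", model.predict(test)[0])
    ```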

    Fluid Petri Nets for the Performance Evaluation of MapReduce Applications

    Big Data applications make it possible to successfully analyze large amounts of data, not necessarily structured, though at the same time they present new challenges. For example, predicting the performance of frameworks such as Hadoop can be a costly task, hence the need for models that can be a valuable support for designers and developers. This paper contributes a novel modeling approach based on fluid Petri nets to predict MapReduce job execution times. The experiments we performed at CINECA, the Italian supercomputing center, have shown that the achieved accuracy is within 16% of the actual measurements on average.
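
    The fluid Petri net model itself is not reproduced in the abstract; as a rough intuition for why a fluid view is convenient, one can approximate a MapReduce stage as a continuous level of remaining work drained at a rate set by the busy slots. The task counts, slot count, and rate below are hypothetical and this is not the paper's model.

    ```python
    # Rough fluid-style intuition (not the paper's Petri net model): treat the
    # remaining map work as a fluid level drained by the available slots.

    N_TASKS = 400          # map tasks, hypothetical
    SLOTS = 32             # concurrent containers, hypothetical
    TASK_RATE = 1 / 15.0   # tasks completed per second per slot (assumed)

    remaining = float(N_TASKS)
    t, dt = 0.0, 0.1
    while remaining > 1.0:                  # stop when less than one task's worth of fluid is left
        busy = min(SLOTS, remaining)        # slots cannot outnumber remaining tasks
        remaining -= busy * TASK_RATE * dt  # fluid drain of remaining work
        t += dt

    print(f"fluid estimate of map stage completion: {t:.1f} s")
    # Closed-form approximation when N_TASKS >> SLOTS; the fluid tail above also
    # captures the slowdown once fewer tasks than slots remain.
    print(f"closed-form approximation: {N_TASKS / (SLOTS * TASK_RATE):.1f} s")
    ```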

    D-SPACE4Cloud: Towards Quality-Aware Data Intensive Applications in the Cloud

    The last years witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions claiming to support data intensive applications. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest around the new possibilities that Big Data analytics promise to make reality. Many companies currently engage in data intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and establishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for novel models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus of this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. We propose a novel tool, integrated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effectively explores the space of alternative resource configurations, seeking the minimum cost deployment that satisfies predefined quality of service constraints. The validity and relevance of the proposed solution have been thoroughly assessed in a vast experimental campaign including different applications and Big Data platforms.
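
    The abstract describes the space exploration only at a high level; the sketch below conveys its flavor with a made-up price catalog and a stand-in performance predictor, keeping the cheapest configuration whose predicted execution time meets the deadline. All VM names, prices, and model parameters are assumptions, not D-SPACE4Cloud's actual implementation.

    ```python
    # Illustrative cluster right-sizing search: enumerate candidate configurations,
    # predict execution time for each, and keep the cheapest feasible one.

    DEADLINE_S = 600.0   # hypothetical QoS constraint
    DATASET_GB = 500.0   # hypothetical input size

    # Hypothetical catalog: (VM type, cores per VM, hourly price in $)
    catalog = [("small", 4, 0.20), ("medium", 8, 0.40), ("large", 16, 0.80)]

    def predict_exec_time(total_cores):
        """Stand-in for the simulator/analytical predictor: time shrinks with cores
        but keeps a fixed serial overhead (Amdahl-style assumption)."""
        return 120.0 + 9000.0 * DATASET_GB / 500.0 / total_cores

    best = None
    for vm_type, cores, price in catalog:
        for n_vms in range(1, 65):
            t = predict_exec_time(cores * n_vms)
            if t > DEADLINE_S:
                continue  # deadline violated: configuration infeasible
            cost = n_vms * price * (t / 3600.0)
            if best is None or cost < best[0]:
                best = (cost, vm_type, n_vms, t)

    cost, vm_type, n_vms, t = best
    print(f"cheapest feasible deployment: {n_vms} x {vm_type}, "
          f"predicted time {t:.0f} s, cost ${cost:.2f}")
    ```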

    Performance Prediction of Cloud-Based Big Data Applications

    Big data analytics have become widespread as a means to extract knowledge from large datasets. Yet, the heterogeneity and irregularity usually associated with big data applications often overwhelm the existing software and hardware infrastructures. In such context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application’s current needs. However, these same characteristics impose extra challenges to predicting the performance of cloud-based big data applications, a key step to proper management and planning. This paper explores three modeling approaches for performance prediction of cloud-based big data applications. We evaluate two queuing-based analytical models and a novel fast ad hoc simulator in various scenarios based on different applications and infrastructure setups. The three approaches are compared in terms of prediction accuracy, finding that our best approaches can predict average application execution times with 26% relative error in the very worst case and about 7% on average.
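
    The abstract does not detail the two queuing-based models, but exact Mean Value Analysis for a closed product-form queueing network is a standard building block for this kind of predictor and gives a sense of how such an analytical estimate is computed. The station demands and population below are hypothetical, and MVA here stands in for whatever specific models the paper evaluates.

    ```python
    def mva(demands, n_jobs, think_time=0.0):
        """Exact Mean Value Analysis for a single-class closed queueing network.
        demands[k] is the total service demand at station k (seconds)."""
        q = [0.0] * len(demands)  # mean queue lengths, starting from an empty network
        for n in range(1, n_jobs + 1):
            r = [d * (1.0 + qk) for d, qk in zip(demands, q)]  # residence times
            x = n / (think_time + sum(r))                      # system throughput
            q = [x * rk for rk in r]                           # updated queue lengths
        return x, sum(r)  # throughput and total residence time per job

    # Hypothetical abstraction of a big data node: CPU, disk, and shuffle/network demands.
    throughput, resp_time = mva(demands=[40.0, 15.0, 10.0], n_jobs=8)
    print(f"predicted response time per job: {resp_time:.1f} s, "
          f"throughput {throughput:.3f} jobs/s")
    ```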

    Performance Prediction of Deep Learning Applications Training in GPU as a Service Systems

    Data analysts predict that the GPU as a Service (GPUaaS) market will grow from US$700 million in 2019 to US$7 billion in 2025, with a compound annual growth rate of over 38%, to support 3D models, animated video processing, and gaming. GPUaaS adoption will also be boosted by the use of graphics processing units (GPUs) to support Deep learning (DL) model training. Indeed, nowadays, the main cloud providers already offer in their catalogs GPU-based virtual machines pre-installed with popular DL frameworks (like Torch, PyTorch, TensorFlow, and Caffe), simplifying DL model programming operations. Motivated by these considerations, this paper studies GPU-deployed neural networks (NNs) and tackles the issue of performance prediction, particularly with respect to NN training times. The proposed approach is based on machine learning and exploits two main sets of features describing, on the one hand, the network architecture and the hyper-parameters and, on the other, the hardware characteristics of the target deployment. Such data enable the learning of multiple linear regression models which, coupled with an established feature selection technique, become accurate prediction tools, with errors below 11% on average. An extensive experimental campaign, performed both on public and in-house private cloud deployments, considers popular deep NNs used for image classification and speech transcription and shows that prediction errors remain small even when extrapolating outside the range spanned by the input data. This has important implications for the models’ applicability: in this way, it is possible to investigate the impact on performance of different GPUaaS deployments or hardware upgrades even without conducting an empirical investigation on the specific target device, or to evaluate the changes in training time when the number of inner modules in the deep neural network varies.
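
    The overall recipe, linear regression over architecture and hardware features plus a feature selection step, can be sketched with off-the-shelf scikit-learn components. The synthetic features and targets are invented for illustration, and recursive feature elimination stands in for whatever selection technique the authors actually adopted.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.feature_selection import RFE

    rng = np.random.default_rng(1)
    n = 200

    # Synthetic descriptors: network/hyper-parameter features and hardware features.
    X = np.column_stack([
        rng.uniform(1e6, 5e7, n),    # number of NN parameters
        rng.integers(16, 256, n),    # batch size
        rng.integers(10, 200, n),    # number of layers
        rng.uniform(5, 16, n),       # GPU peak TFLOPS
        rng.uniform(300, 900, n),    # GPU memory bandwidth (GB/s)
    ])

    # Synthetic per-epoch training time: grows with model size, shrinks with TFLOPS.
    y = 5.0 + 2.0 * X[:, 0] / 1e6 / X[:, 3] + 0.05 * X[:, 2] + rng.normal(0, 1.0, n)

    # Linear regression wrapped in recursive feature elimination keeps the
    # three most informative features.
    selector = RFE(LinearRegression(), n_features_to_select=3)
    selector.fit(X, y)
    print("selected features:", selector.support_)

    pred = selector.predict(X)
    rel_err = np.abs(pred - y) / y
    print(f"mean relative error on training data: {rel_err.mean():.2%}")
    ```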

    Fluid petri nets for the performance evaluation of mapreduce and spark applications

    Big Data applications make it possible to successfully analyze large amounts of data, not necessarily structured, though at the same time they present new challenges. For example, predicting the performance of frameworks such as Hadoop and Spark can be a costly task, hence the need for models that can be a valuable support for designers and developers. Big Data systems are becoming a central force in society, and the use of models can also enable the development of intelligent systems providing Quality of Service (QoS) guarantees to their users through runtime system reconfiguration. This paper contributes a novel modeling approach based on fluid Petri nets to predict MapReduce and Spark application execution times, which is suitable for runtime performance prediction. The models have been validated by an extensive experimental campaign performed at CINECA, the Italian supercomputing center, and on the Microsoft Azure HDInsight data platform. Results have shown that, on average, the achieved accuracy is around 9.5% of the actual measurements for MapReduce and about 10% for Spark.

    Optimal Resource Allocation of Cloud-Based Spark Applications

    Nowadays, the big data paradigm is consolidating its central position in the industry, as well as in society at large. Lots of applications, across disparate domains, operate on huge amounts of data and offer great advantages both for business and research. According to analysts, cloud computing adoption is steadily increasing to support big data analyses, and Spark is expected to take a prominent market position for the next decade. As big data applications gain more and more importance over time, and given the dynamic nature of cloud resources, it is fundamental to develop an intelligent resource management system to provide Quality of Service guarantees to end users. This paper presents a set of run-time optimization-based resource management policies for advanced big data analytics. Users submit Spark applications characterized by a priority and by a hard or soft deadline. The optimization policies address two scenarios: i) identification of the minimum capacity to run a Spark application within its deadline; ii) re-balancing of the cloud resources in case of heavy load, minimising the weighted soft deadline tardiness across applications. The solution relies on an initial non-linear programming model formulation and a search space exploration based on simulation-optimization procedures. Spark application execution times are estimated by relying on a gamut of techniques, including machine learning, approximate analyses, and simulation. The benefits of the approach are evaluated on Microsoft Azure HDInsight and on a private cloud cluster based on POWER8, considering the TPC-DS industry benchmark and SparkBench. The results obtained in the first scenario demonstrate that the percentage error of the predicted optimal resource usage, with respect to system measurements and exhaustive search, is in the range 4%-29%, while literature-based techniques present an average error in the range 6%-63%. Moreover, in the second scenario, the proposed algorithms can address complex problems, like computing the optimal redistribution of resources among tens of applications, in less than a minute with an error of 8% on average. On the same tests, literature-based approaches obtain an average error of about 57%.
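
    The first scenario, finding the minimum capacity that lets a Spark application finish within its deadline, can be illustrated with a toy search under an assumed Amdahl-style execution-time model. The model, its parameters, and the core budget below are assumptions for illustration, not the paper's non-linear programming formulation.

    ```python
    # Toy version of scenario (i): smallest number of cores meeting the deadline,
    # under an assumed execution-time model t(c) = t_serial + t_parallel / c.

    T_SERIAL = 90.0        # seconds of non-parallelizable work (assumed)
    T_PARALLEL = 14000.0   # core-seconds of parallelizable work (assumed)
    DEADLINE = 400.0       # hard deadline from the SLA (seconds)
    MAX_CORES = 512        # available core budget (assumed)

    def exec_time(cores):
        """Predicted execution time for a given number of cores."""
        return T_SERIAL + T_PARALLEL / cores

    # t(c) is decreasing in c, so a simple scan (or bisection) finds the minimum.
    min_cores = next(
        (c for c in range(1, MAX_CORES + 1) if exec_time(c) <= DEADLINE), None
    )

    if min_cores is None:
        print("deadline cannot be met within the core budget")
    else:
        print(f"minimum cores: {min_cores}, predicted time {exec_time(min_cores):.0f} s")
    ```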

    DICE H2020 Deliverable D3.9 — Final version

    The data hereby published supports the results of the European research project DICE H2020 deliverable D3.9, which can be downloaded at http://www.dice-h2020.eu/deliverables/. When referring to the data, please cite the above mentioned deliverable.